Smart Paper Notes

AI-based Monitoring and Response System for Hospital Preparedness towards COVID-19 in Southeast Asia

Tushar Goswamy, Naishadh Parmar, Ayush Gupta, Raunak Shah, Vatsalya Tandon, Varun Goyal, Sanyog Gupta, Karishma Laud, Shivam Gupta, Sudhanshu Mishra

Categories: Natural Language Processing | Machine Learning

2020-07-30

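This research paper proposes a COVID-19 monitoring and response system that identifies surges in patient volume at hospitals and shortages of critical equipment, such as ventilators, in Southeast Asian countries, in order to gauge the burden on health facilities. This can help authorities in these regions with resource planning, so that resources can be redirected to the areas the model identifies. Because there is little publicly available data on the influx of patients into hospitals, or on the shortages of equipment, ICU units, or hospital beds that these countries may be facing, we use Twitter data to gather this information. The approach has given accurate results for the states of India, and we are working on validating the model for the remaining countries so that it can serve as a reliable tool for authorities to monitor the burden on hospitals.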


Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions

Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal

Categories: Natural Language Processing

2022-12-20

Recent work has shown that large language models are capable of generating natural language reasoning steps or Chains-of-Thoughts (CoT) to answer a multi-step question when prompted to do so. This is insufficient, however, when the necessary knowledge is not available or up-to-date within a model's parameters. A straightforward approach to address this is to retrieve text from an external knowledge source using the question as a query and prepend it as context to the model's input. This, however, is also insufficient for multi-step QA where what to retrieve depends on what has already been derived. To address this issue we propose IRCoT, a new approach that interleaves retrieval with CoT for multi-step QA, guiding the retrieval with CoT and in turn using retrieved results to improve CoT. Our experiments with GPT3 show substantial improvements in retrieval (up to 22 points) and downstream QA (up to 16 points) over the baselines on four datasets: HotpotQA, 2WikiMultihopQA, MuSiQue, and IIRC. Notably, our method also works well for much smaller models such as T5-Flan-large (0.7B) without any additional training.
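
The interleaving idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy corpus, the keyword-overlap retriever, the scripted reasoning step, and the "answer is" stopping cue are all assumptions made for illustration. In a real system `retrieve` would call a search index and `next_step` would call a language model.

```python
from typing import Callable, List


def interleaved_retrieval_cot(
    question: str,
    retrieve: Callable[[str], List[str]],
    next_step: Callable[[str, List[str], List[str]], str],
    max_steps: int = 4,
) -> List[str]:
    """Alternate between generating one reasoning step and retrieving with it."""
    # The first retrieval uses the question itself as the query.
    paragraphs = list(retrieve(question))
    steps: List[str] = []
    for _ in range(max_steps):
        step = next_step(question, paragraphs, steps)
        steps.append(step)
        # Hypothetical stopping cue: the step states a final answer.
        if "answer is" in step.lower():
            break
        # The newest reasoning step becomes the next retrieval query.
        for paragraph in retrieve(step):
            if paragraph not in paragraphs:
                paragraphs.append(paragraph)
    return steps


# Toy stand-ins (assumptions, for demonstration only).
CORPUS = [
    "Alice was born in Paris.",
    "Paris is the capital of France.",
]


def toy_retrieve(query: str) -> List[str]:
    """Return paragraphs that share a capitalised word with the query."""
    names = {w.strip(".?,") for w in query.split() if w[0].isupper()}
    return [p for p in CORPUS if names & {w.strip(".?,") for w in p.split()}]


def toy_next_step(question: str, paragraphs: List[str], steps: List[str]) -> str:
    """Scripted 'model': answers only once the second fact has been retrieved."""
    if "Paris is the capital of France." in paragraphs:
        return "Paris is in France, so the answer is France."
    return "Alice was born in Paris."


if __name__ == "__main__":
    for s in interleaved_retrieval_cot(
        "Which country was Alice born in?", toy_retrieve, toy_next_step
    ):
        print(s)
```

In this toy run the question alone retrieves only the first paragraph. The second paragraph becomes reachable only after the first reasoning step mentions Paris, which is the dependency between retrieval and derivation that the abstract describes.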


MrSARP: A Hierarchical Deep Generative Prior for SAR Image Super-resolution

Tushar Agarwal, Nithin Sugavanam, Emre Ertin

Categories: Computer Vision | Machine Learning

2022-11-30

Generative models learned from training using deep learning methods can be used as priors in under-determined inverse problems, including imaging from a sparse set of measurements. In this paper, we present a novel hierarchical deep generative model, MrSARP, for SAR imagery that can synthesize SAR images of a target at different resolutions jointly. MrSARP is trained in conjunction with a critic that scores multi-resolution images jointly to decide if they are realistic images of a target at different resolutions. We show how this deep generative model can be used to retrieve the high-spatial-resolution image from low-resolution images of the same target. The cost function of the generator is modified to improve its capability to retrieve the input parameters for a given set of resolution images. We evaluate the model's performance on simulated data using the three standard error metrics for super-resolution and compare it to upsampling and sparsity-based image sharpening approaches.


COMET: A Comprehensive Cluster Design Methodology for Distributed Deep Learning Training

Divya Kiran Kadiyala, Saeed Rashidi, Taekyung Heo, Abhimanyu Rajeshkumar Bambhaniya, Tushar Krishna, Alexandros Daglis

Categories: Artificial Intelligence | Machine Learning

2022-11-30

Modern Deep Learning (DL) models have grown to sizes requiring massive clusters of specialized, high-end nodes to train. Designing such clusters to maximize both performance and utilization, so as to amortize their steep cost, is a challenging task requiring careful balance of compute, memory, and network resources. Moreover, each model has a plethora of tuning knobs that drastically affect performance, with optimal values often depending on the underlying cluster's characteristics, which necessitates a complex cluster-workload co-design process. To facilitate the design space exploration of such massive DL training clusters, we introduce COMET, a holistic cluster design methodology and workflow to jointly study the impact of parallelization strategies and key cluster resource provisioning on the performance of distributed DL training. We develop a step-by-step process to establish a reusable and flexible methodology, and demonstrate its application with a case study of training a Transformer-1T model on a cluster of variable compute, memory, and network resources. Our case study demonstrates COMET's utility in identifying promising architectural optimization directions and guiding system designers in configuring key model and cluster parameters.


XF2T: Cross-lingual Fact-to-Text Generation for Low-Resource Languages

Shivprasad Sagare, Tushar Abhishek, Bhavyajeet Singh, Anubhav Sharma, Manish Gupta, Vasudeva Varma

Categories: Natural Language Processing

2022-09-22

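Many business scenarios require automated generation of descriptive, human-readable text from structured input data. Hence, fact-to-text generation systems have been developed for various downstream tasks, mainly owing to the high availability of relevant datasets. Only recently was the problem of cross-lingual fact-to-text (XF2T) generation across multiple languages proposed, along with a dataset, XAlign, covering eight languages. However, there has been no rigorous work on the actual XF2T generation problem. We extend the XAlign dataset with annotated data for four more languages: Punjabi, Malayalam, Assamese, and Oriya. We conduct an extensive study using popular Transformer-based text generation models on our extended multilingual dataset, which we call XAlignV2. Further, we investigate the performance of different text generation strategies: multiple variations of pretraining, fact-aware embeddings, and structure-aware input encoding. Our extensive experiments show that a multilingual mT5 model that uses fact-aware embeddings with structure-aware input encoding gives the best results on average across the twelve languages. We make our code, dataset, and model publicly available, and hope that this will help advance further research in this critical area.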


Deep Physics Corrector: A physics enhanced deep learning architecture for solving stochastic differential equations

Tushar, Souvik Chakraborty

Categories: Machine Learning (Statistics) | Machine Learning

2022-09-20

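We propose a novel gray-box modeling algorithm for physical systems governed by stochastic differential equations (SDEs). The proposed approach, referred to as the Deep Physics Corrector (DPC), blends physics represented in terms of an SDE with a deep neural network (DNN). The primary idea is to use the DNN to model the missing physics. We hypothesize that combining incomplete physics with data will make the model interpretable and allow better generalization. The main bottleneck in training surrogate models for stochastic simulators is often the choice of a suitable loss function. Among the different loss functions available in the literature, we use the conditional maximum mean discrepancy (CMMD) loss function in DPC because of its proven performance. Overall, physics-data fusion and CMMD allow DPC to learn from sparse data. We illustrate the performance of the proposed DPC on four benchmark examples from the literature. The results obtained are highly accurate, indicating its possible application as a surrogate model for stochastic simulators.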
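
To give a feel for the family of losses mentioned above, here is a minimal sketch of the plain (unconditional) maximum mean discrepancy between two one-dimensional samples. It is not the conditional variant (CMMD) that the abstract names and it is not taken from the paper. The Gaussian kernel, the fixed bandwidth, and the biased estimator are assumptions chosen for brevity.

```python
import math
from typing import Sequence


def gaussian_kernel(x: float, y: float, bandwidth: float = 1.0) -> float:
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(
    xs: Sequence[float], ys: Sequence[float], bandwidth: float = 1.0
) -> float:
    """Biased estimate of the squared MMD between samples xs and ys."""

    def mean_kernel(a: Sequence[float], b: Sequence[float]) -> float:
        # Average kernel value over all pairs (u, v) with u in a and v in b.
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


if __name__ == "__main__":
    # Samples from the same values give a discrepancy of (numerically) zero,
    # while well-separated samples give a clearly positive value.
    print(mmd_squared([0.0, 1.0, 2.0], [0.0, 1.0, 2.0]))
    print(mmd_squared([0.0, 0.1], [5.0, 5.1]))
```

A loss of this kind compares samples drawn from a model with observed samples, which is why it suits stochastic simulators whose outputs are distributions and not single values.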


An Automatic Speech Recognition System for Bengali Language based on Wav2Vec2 and Transfer Learning

Tushar Talukder Showrav

Categories: Natural Language Processing

2022-09-16

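Automatic speech recognition (ASR) is an independent, automated method of decoding and transcribing spoken language. A typical ASR system extracts features from audio recordings or streams and runs one or more algorithms to map those features to the corresponding text. Much research has been done in the field of speech signal processing in recent years. Given adequate resources, both conventional ASR and emerging end-to-end (E2E) speech recognition have produced promising results. However, for low-resource languages such as Bengali, the current state of ASR lags behind, even though this low-resource status does not reflect the fact that the language is spoken by over 500 million people worldwide. Despite its popularity, few open-source datasets are available, which makes it difficult to conduct research on Bengali speech recognition systems. This paper is part of the competition named "BUET CSE Fest DL Sprint". Its purpose is to improve speech recognition performance for Bengali by applying speech recognition technology with an E2E structure based on a transfer learning framework. The proposed method models the Bengali language effectively and achieves a score of 3.819 in "Levenshtein Mean Distance" on a test dataset of 7747 samples, when only 1000 samples of the training dataset were used for training.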
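
The "Levenshtein Mean Distance" score reported above can be understood through a small sketch. It assumes the metric is the character-level edit distance between each predicted transcript and its reference, averaged over the test set. The abstract does not spell out the competition's exact definition, so both the character-level granularity and the function names here are assumptions.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete char_a
                    curr[j - 1] + 1,     # insert char_b
                    prev[j - 1] + cost,  # substitute or match
                )
            )
        prev = curr
    return prev[-1]


def mean_levenshtein(predictions: List[str], references: List[str]) -> float:
    """Average edit distance over paired predictions and references."""
    if len(predictions) != len(references) or not references:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(levenshtein(p, r) for p, r in zip(predictions, references))
    return total / len(references)


if __name__ == "__main__":
    print(mean_levenshtein(["kitten", "abc"], ["sitting", "abc"]))
```

Lower is better for a metric of this kind: a value of 0 would mean every transcript matched its reference exactly.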


Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask

Sheng-Chun Kao, Amir Yazdanbakhsh, Suvinay Subramanian, Shivani Agrawal, Utku Evci, Tushar Krishna

Categories: Machine Learning | Artificial Intelligence

2022-09-15

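Sparsity has become one of the promising methods for compressing and accelerating deep neural networks (DNNs). Among the different categories of sparsity, structured sparsity has attracted attention because it executes efficiently on modern accelerators. In particular, N:M sparsity is attractive because there are already hardware accelerator architectures that can exploit certain forms of N:M structured sparsity to yield higher compute efficiency. In this work, we focus on N:M sparsity and extensively study and evaluate various training recipes for it in terms of the trade-off between model accuracy and compute cost (FLOPs). Building on this study, we propose two new decay-based pruning methods, namely "pruning mask decay" and "sparse structure decay". Our evaluations indicate that the proposed methods consistently deliver state-of-the-art (SOTA) model accuracy, comparable to unstructured sparsity, on a Transformer-based model for a translation task. The gain in sparse-model accuracy from the new training recipes comes at the cost of a marginal increase in total training compute (FLOPs).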
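
The N:M constraint itself can be shown with a short sketch: in every group of M consecutive weights, at most N are kept. Keeping the N largest-magnitude weights in each group is a common convention and is assumed here. The code illustrates only the static mask, not the decay-based training recipes proposed in the paper, and the function name is hypothetical.

```python
from typing import List


def nm_mask(weights: List[float], n: int, m: int) -> List[int]:
    """Return a 0/1 mask keeping the n largest-magnitude weights per group of m."""
    if not 0 < n <= m:
        raise ValueError("need 0 < n <= m")
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")
    mask = [0] * len(weights)
    for start in range(0, len(weights), m):
        group = range(start, start + m)
        # Indices of the n weights with the largest absolute value in this group.
        keep = sorted(group, key=lambda i: abs(weights[i]), reverse=True)[:n]
        for i in keep:
            mask[i] = 1
    return mask


if __name__ == "__main__":
    weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
    mask = nm_mask(weights, n=2, m=4)
    print(mask)  # [1, 0, 0, 1, 0, 1, 1, 0]
    print([w * k for w, k in zip(weights, mask)])
```

With n=2 and m=4, each group of four weights keeps exactly two, which is the regular pattern that lets hardware skip the zeroed positions.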


A Novel Multi-Task Learning Approach for Context-Sensitive Compound Type Identification in Sanskrit

Jivnesh Sandhan, Ashish Gupta, Hrishikesh Terdalkar, Tushar Sandhan, Suvendu Samanta, Laxmidhar Behera, Pawan Goyal

Categories: Natural Language Processing

2022-08-22

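Compounding is ubiquitous in Sanskrit. It achieves brevity in expressing thoughts while enriching the lexical and structural formation of the language. In this work, we focus on the Sanskrit Compound Type Identification (SaCTI) task, in which we consider the problem of identifying the semantic relations between the components of a compound word. Earlier approaches rely solely on the lexical information obtained from the components and ignore the contextual and syntactic information that is most crucial for SaCTI. However, the SaCTI task is challenging primarily because of the implicitly encoded, context-sensitive semantic relations between the compound components. We therefore propose a novel multi-task learning architecture that incorporates contextual information and enriches it with complementary syntactic information, using morphological tagging and dependency parsing as two auxiliary tasks. Experiments on the SaCTI benchmark datasets show absolute gains of 6.1 points (accuracy) and 7.7 points (F1 score) over the state-of-the-art system. Further, our multilingual experiments demonstrate the efficacy of the proposed architecture in English and Marathi. The code and datasets are publicly available at https://github.com/ashishgupta2598/sacti.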


Persuasion Strategies in Advertisements: Dataset, Modeling, and Baselines

Yaman Kumar Singla, Rajat Jha, Arunim Gupta, Milan Aggarwal, Aditya Garg, Ayush Bhardwaj, Tushar, Balaji Krishnamurthy, Rajiv Ratn Shah, Changyou Chen

Categories: Natural Language Processing | Computer Vision

2022-08-20

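Modeling what makes an advertisement persuasive, that is, what elicits the desired response from consumers, is critical to the study of propaganda, social psychology, and marketing. Despite its importance, computational modeling of persuasion in computer vision is still in its infancy, primarily because of the lack of benchmark datasets that provide persuasion-strategy labels associated with ads. Motivated by the persuasion literature in social psychology and marketing, we introduce an extensive vocabulary of persuasion strategies and build the first ad image corpus annotated with persuasion strategies. We then formulate the task of persuasion strategy prediction with multi-modal learning, for which we design a multi-task attention fusion model that can leverage other ad-understanding tasks to predict persuasion strategies. Further, we conduct a real-world case study on 1600 advertising campaigns of 30 Fortune 500 companies, using our model's predictions to analyze which strategies work with different demographics (age and gender). The dataset also provides image segmentation masks that label the persuasion strategies in the corresponding ad images on the test split. We publicly release our code and dataset at https://midas-research.github.io/persuasion-avertisements/.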
